Abstract

The initial purpose of this work is to provide a probabilistic explanation of a recent
result on a version of Smoluchowski's coagulation equations in which the number of ag- 
gregations is limited. The latter models the deterministic evolution of concentrations of 
particles in a medium where particles coalesce pairwise as time passes and each particle 
can only perform a given number of aggregations. Under appropriate assumptions, the 
concentrations of particles converge as time tends to infinity to some measure which bears
a striking resemblance to the distribution of the total population of a Galton-Watson
process started from two ancestors.

Roughly speaking, the configuration model is a stochastic construction which aims at
producing a typical graph on a set of vertices with prescribed degrees. Specifically,
one attaches to each vertex a certain number of stubs, and then joins the stubs pairwise
uniformly at random to create edges between vertices.

In this work, we use the configuration model as the stochastic counterpart of Smoluchowski's
coagulation equations with limited aggregations. We establish a hydrodynamical
type limit theorem for the empirical measure of the shapes of clusters in the configuration
model when the number of vertices tends to infinity. The limit is given in terms of the
distribution of a Galton-Watson process started with two ancestors.



1 Introduction 

The motivation for this work stems from a recent study of a deterministic model for coagulation
with a limited number of aggregations. Specifically, in [5], one considers particles that are
determined by a pair of integers $(a, k)$, where $k \ge 1$ represents the size and $a \ge 0$ the number
of aggregations that the particle can perform. In the model called symmetric, coagulations

\[ \{(a,k),(a',k')\} \longrightarrow (a + a' - 2,\; k + k') \]

occur at rate

\[ a a'\, c_t(a, k)\, c_t(a', k') , \]

where $c_t(a,k)$ denotes the concentration of particles $(a,k)$ at time $t$ in the medium. Analytically,
this means that the evolution of concentrations is governed by the following variation of
Smoluchowski's coagulation equations (cf. the survey by Aldous [2]):

\[ \frac{d}{dt} c_t(a,k) = \frac{1}{2} \sum_{a'=1}^{a+1} \sum_{k'=1}^{k-1} a'(a - a' + 2)\, c_t(a',k')\, c_t(a - a' + 2, k - k') - \sum_{a'=1}^{\infty} \sum_{k'=1}^{\infty} a a'\, c_t(a, k)\, c_t(a', k') \qquad (1) \]

where the first term on the right-hand side accounts for the creation of particles $(a, k)$ as
the result of coagulations of pairs $\{(a', k'), (a - a' + 2, k - k')\}$ and the second term for the
disappearance of particles $(a, k)$ after a coagulation with a particle $(a', k')$.

One of the main results in [5] is that under appropriate conditions on the initial data that
we shall recall later on, the concentrations $c_t(a, k)$ have a limit as time $t$ tends to infinity which
is given by

Coo(a, k) = l^^^,y^-J—^i,*\k - 2) for a G N and > 2 . (2) 

Here, $\nu$ is a certain probability measure on $\mathbb{N}$ with $\sum_{n=0}^{\infty} n\,\nu(n) \le 1$ that depends on the initial
data, and $\nu^{*k} = \nu * \cdots * \nu$ denotes its $k$-th convolution power. The expression (2) bears a striking
resemblance to a special case of the celebrated formula due to Dwass [9], who established that
the total population $T_2(\nu)$ generated by a (sub)-critical Galton-Watson branching process with
reproduction law $\nu$ and started from two ancestors is given by

\[ \mathbb{P}(T_2(\nu) = k) = \frac{2}{k}\, \nu^{*k}(k-2), \qquad k \ge 2. \qquad (3) \]

This calls for a probabilistic explanation and provides the incentive for the present work.
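As a numerical sanity check on Dwass' formula (3), the following Python sketch evaluates $\frac{2}{k}\nu^{*k}(k-2)$ by direct convolution and sums it over $k$. The function names and the example reproduction law are illustrative assumptions and are not taken from the paper; for a subcritical law the masses should add up to approximately 1.

```python
def convolution_power_at(nu, k, target):
    """Return nu^{*k}(target), where nu[j] is the mass of nu at the integer j."""
    if target < 0:
        return 0.0
    # dist[s] = probability that the partial sum equals s; since the steps are
    # nonnegative, partial sums above target can be dropped safely.
    dist = [1.0] + [0.0] * target
    for _ in range(k):
        new = [0.0] * (target + 1)
        for s, p in enumerate(dist):
            if p == 0.0:
                continue
            for j, q in enumerate(nu):
                if s + j > target:
                    break
                new[s + j] += p * q
        dist = new
    return dist[target]


def dwass_two_ancestors(nu, k):
    """Right-hand side of (3): (2/k) * nu^{*k}(k - 2), for k >= 2."""
    return 2.0 / k * convolution_power_at(nu, k, k - 2)


if __name__ == "__main__":
    nu = [0.75, 0.0, 0.25]  # hypothetical subcritical law with mean 1/2
    total = sum(dwass_two_ancestors(nu, k) for k in range(2, 151))
    print(total)  # should be close to 1 for a subcritical law
```

For small $k$ the values can be checked by hand: with two ancestors, $T_2 = 2$ means that both ancestors are childless, so the formula must return $\nu(0)^2$.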

Our approach for relating (2) to (3) stems from the fact that solutions to the classical Smoluchowski
coagulation equations (without restriction on the number of aggregations) appear as
the hydrodynamical limit of certain stochastic coalescent models introduced by Marcus and
Lushnikov. In some loose sense, the latter describe the microscopic random dynamics of the
particle system when the macroscopic evolution is governed by Smoluchowski's coagulation
equations. This important feature has been established rigorously by Norris [18]. We also
refer to Section 5.2.1 in [4] for an elementary approach in the special case of the multiplicative
kernel, as the latter bears an obvious similarity to (1). On the other hand, it is well-known
that the multiplicative coalescent is naturally related to the size of the connected components
in the random graph model of Erdős and Rényi, see in particular the remarkable paper by
Aldous [1]. This leads us to consider an extension of the random graph model where the sequence
of degrees of vertices is given, and which is known as the configuration model. Loosely
speaking, the configuration model is constructed by an elementary stochastic algorithm which
aims at producing a random graph on a set of vertices with prescribed degrees; in general
the resulting graph is not simple, in the sense that there may exist loops and multiple edges.
Typically, a certain number of stubs is appended to each vertex, and one joins the stubs
pairwise uniformly at random to create edges between vertices. This induces a natural partition
of the set of vertices into clusters, i.e. connected components.

Since its introduction independently by Bollobás [7] and Wormald [20] (see also Bender and
Canfield [3]), this model has been studied in the mathematical literature by many authors. We 
refer e.g. to [16] for an interesting review of applications of this and other random graph models 
to some real life network systems. The main known results chiefly concern asymptotics when 
the number of vertices is large and the empirical measure of the degrees of vertices converges. 
In particular, Molloy and Reed [14] have determined the critical parameter for the existence of a
giant component; see also [15] and [17]. In different directions, van der Hofstad, Hooghiemstra 
and co-authors [10, 11, 13] have made deep contributions to the study of distances between
vertices in such random graphs, while Britton et al. [8] used the configuration model to produce
large random simple graphs with prescribed asymptotic degree distribution.

If we neglect the appearance of multiple edges, loops or cycles which do not contribute
to aggregation of clusters, the configuration model may serve as a stochastic counterpart to
the deterministic evolution of concentrations in the variant (1) of Smoluchowski's coagulation
equations. This leads us to investigate the size of typical clusters, and more generally their
combinatorial structures. Roughly speaking, the main result of this work is a hydrodynamical
limit theorem for the empirical distribution of the shapes of clusters rooted at a generic stub.
The limit is expressed in terms of a pair of Galton-Watson trees which are connected by an
extra edge between the two roots. In particular, this yields the probabilistic explanation of the
formal similarity between the solution (2) and Dwass' formula (3).

Let us now present some heuristics which are close to some of those that have already been 
used in the literature on configuration models to relate the latter to Galton-Watson processes;
see in particular [H] and [12] . Imagine that we pick a stub uniformly at random; the degree of 
the vertex to which this stub is appended has then the size-biased law of the degree of a typical 
vertex. We then pick a second stub uniformly at random to create the first edge. Informally,
when the number of vertices is large, the degree of the vertex to which the second stub is
appended has again the size-biased law and is essentially independent of the first. These two
vertices should be viewed as the ancestors of two growing populations, where, by induction,
individuals beget independently and with a reproduction law given by the distribution of the
outer-degree of a size-biased vertex. When the reproduction law is critical or sub-critical,
the Galton-Watson process eventually becomes extinct, and extinction occurs before any loop,
multiple edge or cycle arises in that cluster of the configuration model. This suggests that
the combinatorial structure of a typical (not too large) cluster could be described as a pair of
independent Galton-Watson trees which are connected by an additional edge between the two
roots. More precisely, the reproduction law should be given by the size-biased degree of a typical
vertex, shifted by one unit, because the number of children corresponds to the outer-degree of
the vertex.

The present work can be viewed as a companion to the recent paper [6], in which we also
identify in terms of certain Galton-Watson trees the limiting empirical distribution of random
structures that appear in a toy model for polymerization. More precisely, we consider in [6] a
system of grabbing particles, where particles consist of monomers having a certain number of
arms. Arms are activated successively uniformly at random, and each time an arm is activated,
it grabs a particle uniformly at random amongst those which have not been previously grabbed
and do not belong to its own cluster either. The main result of [6] is that when the initial
number of particles is large and the numbers of arms are given by i.i.d. random variables with
mean less than 1, then the empirical distribution of the shapes of polymers is close to that
induced by a Galton-Watson tree with a single ancestor and reproduction law given by the
distribution of the number of arms of a typical monomer.

The plan of this work is as follows. The next section is devoted to preliminaries on configuration
models, the combinatorial structure of planar rooted trees, and Galton-Watson processes.
The emphasis is put on planar structures and their codings by the sequence of degrees via
breadth-first search. The main result on the empirical distribution of the structures of rooted
clusters in large random configurations is stated in Section 3 and then proved by explicit first
and second moment estimates. Finally, Section 4 is devoted to some applications. We shall
point at certain invariance properties of Galton-Watson trees under random re-rooting, and
conclude by explaining the striking resemblance between the formulas (2) and (3).
2 Preliminaries



2.1 Pairings, configurations and clusters 

The aim of this work is to relate random configuration models to Galton-Watson trees, and as
the latter have a natural planar structure, we shall introduce the former in a planar setting which
is tailored for our purposes. In this direction, we should imagine particles as planar star-shaped
objects consisting of a vertex to which a certain number of stubs are appended.

Formally, we consider some finite set $V$ of vertices and a map $d : V \to \mathbb{N}^*$, where $d(v)$
represents the degree of the vertex $v$, that is the number of stubs attached to $v$. We denote by
$\mathcal{S} = \mathcal{S}(V, d)$ the set of stubs and shall suppose for the sake of simplicity that the total number
of stubs

\[ S := \sum_{v \in V} d(v) \]

is even; otherwise we may always decide to add a new stub to some vertex (or to add a vertex
with a single stub). We call a partition of $\mathcal{S}$ into $S/2$ pairs a pairing of stubs and write $\Pi(\mathcal{S})$
for the set of pairings of stubs. We first point at the following elementary facts.



Lemma 1 (i) The cardinal of $\Pi(\mathcal{S})$ is given by

\[ \#\Pi(\mathcal{S}) = \frac{S!}{(S/2)!\, 2^{S/2}} = \prod_{i=1}^{S/2} (S - 2i + 1) . \]

(ii) Consider a partition of $V$ into two subsets $V_1, V_2$ such that $S_1 := \sum_{v \in V_1} d(v)$ and
$S_2 := \sum_{v \in V_2} d(v)$ are even numbers. Set $\mathcal{S}_1 := \mathcal{S}(V_1, d)$ and $\mathcal{S}_2 := \mathcal{S}(V_2, d)$. Then the map

\[ (\pi_1, \pi_2) \longmapsto \pi_1 \cup \pi_2 \]

is a bijection from $\Pi(\mathcal{S}_1) \times \Pi(\mathcal{S}_2)$ to the subset of $\Pi(\mathcal{S})$ consisting of pairings $\pi$ such that there
are no pairs $\{s_1, s_2\}$ in $\pi$ formed by a stub $s_1$ attached to a vertex in $V_1$ and a stub $s_2$ attached
to a vertex in $V_2$.



Proof: Indeed, a generic pairing can be obtained by enumerating the stubs by $\{1, \ldots, S\}$
and then pairing the stubs according to the couples $(1, 2), (3, 4), \ldots, (S - 1, S)$. There are
$S!$ possible enumerations and the mapping is $(S/2)!\, 2^{S/2}$ to 1, since $(S/2)!$ accounts for the
number of permutations of the $S/2$ couples $(2i - 1, 2i)$, and $2^{S/2}$ for the number of ways $S/2$
unordered pairs can be ordered into couples. This establishes the first claim. The second is
obvious. □
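The counting in Lemma 1(i) can be checked by brute force for small $S$. The Python sketch below compares the closed formula, the product formula and an explicit enumeration of all pairings; the function names are hypothetical and chosen only for this illustration.

```python
from math import factorial, prod


def count_pairings_formula(S):
    """S! / ((S/2)! * 2^(S/2)) for an even number S of stubs."""
    assert S % 2 == 0
    return factorial(S) // (factorial(S // 2) * 2 ** (S // 2))


def count_pairings_product(S):
    """Product of (S - 2i + 1) over i = 1, ..., S/2."""
    return prod(S - 2 * i + 1 for i in range(1, S // 2 + 1))


def enumerate_pairings(stubs):
    """Yield every partition of the list `stubs` into unordered pairs."""
    if not stubs:
        yield []
        return
    first, rest = stubs[0], stubs[1:]
    # The first stub must be paired with exactly one of the remaining stubs.
    for j, partner in enumerate(rest):
        remaining = rest[:j] + rest[j + 1:]
        for tail in enumerate_pairings(remaining):
            yield [(first, partner)] + tail


if __name__ == "__main__":
    for S in (2, 4, 6, 8):
        n = sum(1 for _ in enumerate_pairings(list(range(S))))
        print(S, n, count_pairings_formula(S), count_pairings_product(S))
```

The enumeration is exponential in $S$ and is only meant for very small examples.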
We then form edges $e = \{v, v'\}$ with $v, v' \in V$ by joining the tips of pairs of stubs $\{s, s'\}$,
where $s$ (respectively, $s'$) is appended to $v$ (respectively, to $v'$). We stress that an edge is
unoriented, that it can be a loop (i.e. the two vertices $v$ and $v'$ defining an edge may coincide),
and that the same edge may appear by joining different pairs of stubs. Each pairing of stubs
$\pi$ yields a configuration $\gamma(\pi)$, that is the family of the $S/2$ edges induced by the pairing. Note
that there may be multiple edges; the same edge is repeated in $\gamma(\pi)$ as many times as it arises
by joining different pairs of stubs in $\pi$. We also stress that the map $\pi \mapsto \gamma(\pi)$ is not injective.

We view an edge which is not a loop as an elementary path connecting two different vertices,
so a configuration $\gamma(\pi)$ on $(V, d)$ naturally induces a partition of $V$ into connected components.
Endowing a given connected component with the restriction of $\gamma(\pi)$ to the set of edges formed
by pairs of vertices in that component, we obtain a cluster.
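The construction of this section (stubs, uniform pairing, configuration, clusters) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: stubs are represented as hypothetical pairs (vertex, rank), the uniform pairing is drawn by shuffling the stubs and pairing consecutive entries as in the proof of Lemma 1, and clusters are found with a standard union-find structure.

```python
import random
from collections import defaultdict


def make_stubs(degrees):
    """degrees maps each vertex v to d(v); a stub is a pair (vertex, rank)."""
    return [(v, r) for v, d in degrees.items() for r in range(d)]


def uniform_pairing(stubs, rng):
    """Shuffle the stubs and pair consecutive entries (uniform over pairings)."""
    shuffled = list(stubs)
    rng.shuffle(shuffled)
    return [(shuffled[i], shuffled[i + 1]) for i in range(0, len(shuffled), 2)]


def configuration(pairing):
    """Edges, with multiplicity, obtained by forgetting which stubs were joined."""
    return [(s[0], t[0]) for s, t in pairing]


def clusters(vertices, edges):
    """Return a list of (sorted vertex list, number of edges), one per cluster."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, w in edges:
        parent[find(u)] = find(w)
    members, edge_count = defaultdict(list), defaultdict(int)
    for v in vertices:
        members[find(v)].append(v)
    for u, _ in edges:
        edge_count[find(u)] += 1
    return [(sorted(vs), edge_count[root]) for root, vs in members.items()]


def is_tree(cluster):
    """A connected cluster with k vertices is a tree iff it has k - 1 edges."""
    vs, n_edges = cluster
    return n_edges == len(vs) - 1
```

The tree test relies on the elementary fact that a connected multigraph on $k$ vertices with exactly $k - 1$ edges (loops and repeated edges counted) has no loop, no multiple edge and no cycle.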

2.2 Planar rooted trees and their structures 

Lemma 1(ii) enables us to reduce the study of a given cluster to that of pairings $\pi \in \Pi(\mathcal{S})$
such that the entire set of vertices $V$ is connected for the configuration $\gamma(\pi)$. We shall therefore
focus on that case in this section. Recall that a cycle is a sequence of $\ell \ge 3$ distinct vertices,
say $v_1, \ldots, v_\ell$, such that there exists an edge connecting $v_j$ and $v_{j+1}$ for every $j = 1, \ldots, \ell - 1$
and also an edge connecting $v_\ell$ and $v_1$.

A configuration $\gamma(\pi)$ that connects $V$ is called a tree if it contains no loops, no multiple
edges, and no cycle. Note that this can occur only when $S/2 = \#V - 1$. Because particles (i.e.
vertices and the stubs that are appended) can be viewed as planar objects, we may think of
tree-configurations as planar structures, in the sense that they can be represented in the plane
in such a way that edges are line segments which do not cross, by attributing lengths to the
edges in an appropriate manner. Throughout this section, we assume that $\#V = k$ and that
the configuration $\gamma(\pi)$ is a tree; in particular $\gamma(\pi)$ consists of $k - 1$ edges and $S = 2(k - 1)$.

To describe precisely the shape, that is the combinatorial structure, of a tree, we need to
specify an origin and an orientation. For this, we distinguish a stub $s$ and call it the root. This
stub is appended to a certain vertex $v$ that we use as the origin. Distinguishing $s$ also enables
us to order all the stubs attached to $v$ by deciding that the first stub is $s$ and the next (if any)
are ranked clockwise from that one. Further, for every vertex $v' \ne v$ in that tree, we distinguish
the stub appended to $v'$ that points at the origin $v$. This provides a natural order on the set of
stubs appended to any given vertex of the tree, and thus enables the use of breadth-first search
to enumerate the vertices of the tree.

Specifically, set $s_1 = s$ and $v_1 = v$, define $s_{i+1}$ as the stub that is paired with the $i$-th
stub appended to $v_1$ for $i = 1, \ldots, d(v_1)$, and write $v_{i+1}$ for the vertex to which $s_{i+1}$ is appended.
We should think of $v_2, \ldots, v_{d(v_1)+1}$ as the children of $v_1$. The stub $s_2$ is chosen as the first
of the stubs appended to $v_2$, thus it is the unique stub pointing at the origin and the other
stubs attached to $v_2$ are ranked clockwise from $s_2$ and point at the children of $v_2$ (i.e. the
vertices at distance 2 from the origin $v_1$ and at distance 1 from $v_2$). We denote these $d(v_2) - 1$
children by $v_{d(v_1)+2}, \ldots, v_{d(v_1)+d(v_2)}$, and continue with the next children $v_3, \ldots, v_{d(v_1)+1}$ of $v_1$
in an obvious way. Then we proceed with indexing the third generation of vertices, in the order
which is naturally induced by the indexation of the second generation, and so on. See the figure
below.




Figure 1: Enumeration by breadth-first search of the vertices of a planar tree
rooted at the stub =>. The degree sequence is (3, 4, 1, 1, 3, 3, 1, 1, 3, 1, 1, 1, 1). 

We write $d_i$ for the degree of the $i$-th vertex. We stress that for $2 \le i \le k$, the outer-degree
of $v_i$, i.e. the number of stubs appended to $v_i$ that point away from the origin, is $d_i - 1$. It is
well-known that the sequence of degrees $\mathbf{d} = (d_1, \ldots, d_k)$ fulfills

\[ \min\{ j \ge 1 : d_1 + \cdots + d_j = 2(j-1) \} = k, \qquad (4) \]

and characterizes a unique planar rooted tree structure. Conversely, any finite sequence
$\mathbf{d} = (d_1, \ldots, d_k)$ such that (4) holds encodes a planar rooted tree structure with $k$ vertices. We
write $\mathbb{D}$ for the set of sequences $\mathbf{d} = (d_1, \ldots, d_k)$ which fulfill (4), where the length $k \in \mathbb{N}^*$
is arbitrary, and think of the set $\mathbb{D}$ of sequences of degrees as the set of structures of planar
rooted trees. We refer for instance to Section 6.2 in Pitman [19] for details.
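To make the coding by degree sequences concrete, here is a small Python sketch (with hypothetical function names) that tests condition (4) and recovers, for each vertex, the index of its parent in the breadth-first enumeration. It uses the fact stated above that the origin has $d_1$ children while every other vertex $v_i$ has $d_i - 1$ children.

```python
def is_tree_structure(d):
    """Condition (4): the first j with d_1 + ... + d_j == 2(j - 1) is j == len(d)."""
    total = 0
    for j, dj in enumerate(d, start=1):
        if dj < 1:
            return False
        total += dj
        if total == 2 * (j - 1):
            return j == len(d)
    return False


def bfs_parents(d):
    """Map each vertex index (1-based, breadth-first order) to its parent index."""
    assert is_tree_structure(d)
    parent = {1: None}
    next_label = 2
    for i, di in enumerate(d, start=1):
        # The origin has d_1 children; any other vertex has d_i - 1 children,
        # since one of its stubs points back at the origin.
        n_children = di if i == 1 else di - 1
        for _ in range(n_children):
            parent[next_label] = i
            next_label += 1
    return parent


if __name__ == "__main__":
    figure_1 = (3, 4, 1, 1, 3, 3, 1, 1, 3, 1, 1, 1, 1)  # degree sequence of Figure 1
    print(is_tree_structure(figure_1))
    print(bfs_parents(figure_1))
```

Applied to the degree sequence of Figure 1, the sketch assigns the vertices 2, 3, 4 to the origin and the vertices 5, 6, 7 to the vertex 2, in agreement with the indexation described above.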

We now summarize this discussion, introducing first some terminology for convenience. A
bijection $\{1, \ldots, k\} \to V$ can be represented as a sequence $\mathbf{v} = (v_1, \ldots, v_k)$ of distinct vertices
and will be referred to as an enumeration of $V$. We also call a map $\varsigma : V \to \mathcal{S}$ that associates to
each vertex $v \in V$ a stub $s \in \mathcal{S}$ appended to that vertex a selection of stubs. For every pairing
$\pi \in \Pi(\mathcal{S})$ such that the configuration $\gamma(\pi)$ on $V$ is a tree and every choice of a distinguished
stub $s \in \mathcal{S}$, breadth-first search yields a unique enumeration $\mathbf{v} = (v_1, \ldots, v_k)$ of $V$ such that
the sequence $\mathbf{d}(\mathbf{v}) = (d(v_1), \ldots, d(v_k))$ belongs to $\mathbb{D}$ (i.e. fulfills (4)), and a unique selection
of stubs $\varsigma$. The map

\[ (\pi, s) \longmapsto (\mathbf{v}, \varsigma) \]

is bijective. More precisely, we recover the pairing $\pi$ and the root stub $s$ by first constructing
the planar rooted tree structure associated to $\mathbf{d} = (d(v_1), \ldots, d(v_k))$, and then placing the
vertices $v_1, \ldots, v_k$ on this structure in the order induced by the breadth-first search. The first
stub appended to $v_1$ is $s = \varsigma(v_1)$, and for every $i = 2, \ldots, k$, $\varsigma(v_i)$ is the stub appended to $v_i$
which points at the origin $v_1$. This determines the pairing $\pi$.

In order to record this analysis, it is convenient to introduce the multinomial coefficient 



M(y,d) : = 




where $j$ is the number of different values, say $x_1, \ldots, x_j$, occurring in the family $(d(v) : v \in V)$,
and $i_\ell$ the number of occurrences of the value $x_\ell$ in that family. For every structure $\mathbf{d} \in \mathbb{D}$,
we say that $\mathbf{d}$ is compatible with $(V, d)$ if there is at least one enumeration $\mathbf{v} = (v_1, \ldots, v_k)$ of
$V$ such that $\mathbf{d} = (d(v_1), \ldots, d(v_k))$, that is if and only if the sequence $\mathbf{d}$ takes the same values
with the same multiplicity as the family $(d(v) : v \in V)$. The following statement should now
be plain.

Lemma 2 Suppose that $\#V = k$ and $S = 2(k - 1)$. Fix a rooted planar tree structure
$\mathbf{d} = (d_1, \ldots, d_k) \in \mathbb{D}$. If $\mathbf{d}$ is compatible with $(V, d)$, then the number of pairs $(\pi, s) \in \Pi(\mathcal{S}) \times \mathcal{S}$ for
which the configuration $\gamma(\pi)$ is a tree with structure $\mathbf{d}$ when rooted at $s$ equals

\[ M(V, d) \prod_{v \in V} d(v) . \]

Otherwise (i.e. if $\mathbf{d}$ is not compatible), this number is 0.

We stress that all the rooted planar tree structures which are compatible with $(V, d)$ are thus
equally likely to occur if we choose the pair $(\pi, s) \in \Pi(\mathcal{S}) \times \mathcal{S}$ uniformly at random. In the
same vein, it may also be interesting to point at the following simple formula, even though it
will not be used in this paper.
Proposition 1 Suppose that $\#V = k$ and $S = 2(k - 1)$. The number of pairings $\pi \in \Pi(\mathcal{S})$ for
which the configuration $\gamma(\pi)$ is a tree, is

{k-l)\ \{d{v\ 

Proof: To establish the formula, we simply need to calculate the number of enumerations $\mathbf{v}$
of $V$ for which the sequence $(d(v_1), \ldots, d(v_k))$ corresponds to some rooted planar tree structure.
Recall from the ballot theorem (see, e.g., Lemma 6.1 in [19]) that for each enumeration $\mathbf{v}$, there
is a unique cyclic permutation $\sigma$ of $\{1, \ldots, k\}$ such that $(d(v_{\sigma(1)}), \ldots, d(v_{\sigma(k)}))$ fulfills (4). This
shows that this number is $(k - 1)!$. □



2.3 Galton-Watson trees with two ancestors

We consider now a probability measure $\nu$ on $\mathbb{N}$ and associate to $\nu$ a measure on $\mathbb{D}$ by

\[ \mathrm{GW}_2(\mathbf{d}) = \prod_{i=1}^{k} \nu(d_i - 1), \qquad (6) \]

where $\mathbf{d} = (d_1, \ldots, d_k)$ denotes a generic rooted planar tree structure.

The measure $\mathrm{GW}_2$ has a simple interpretation in terms of Galton-Watson branching processes,
and is in fact a sub-probability. More precisely, consider a Galton-Watson process with
reproduction law $\nu$ and started from two ancestors. The process can be represented on the
upper-half plane, where the individuals at generation $\ell \in \mathbb{N}$ lie on the horizontal line $y = \ell$, in an
order consistent with that of their respective parents, so that the edges (line-segments) linking
parents to children do not cross each other. We further connect the two ancestors by an additional
edge, and distinguish the stub attached to the left-most ancestor that thus points at the
right-most ancestor. This enables us to list individuals (vertices) by breadth-first search just
as in the preceding section. Observe that the degree of the left-most ancestor (i.e. the origin)
is distributed as $1 + \xi$ where $\xi$ is a random variable with law $\nu$, whereas the outer-degrees of
the other individuals (i.e. their numbers of children) are given by independent copies of $\xi$.

The event when the total population is finite has probability one if and only if the reproduction
law $\nu$ is critical or subcritical, i.e. $\sum_{i \in \mathbb{N}} i\,\nu(i) \le 1$ and $\nu \ne \delta_1$. Restricting our attention
to this event, the structure of this planar rooted tree is a random variable in $\mathbb{D}$ which has
distribution $\mathrm{GW}_2$ and is defective in the supercritical case.
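The definition (6) can be explored numerically. The Python sketch below, in which all function names and the example law are illustrative assumptions, evaluates $\mathrm{GW}_2(\mathbf{d})$ and sums it by brute force over all structures with a given number $k$ of vertices. For small $k$ the result can be compared by hand with the Dwass formula (3), which gives the law of the total population.

```python
from itertools import product
from math import prod


def in_D(d):
    """Condition (4): the first j with d_1 + ... + d_j == 2(j - 1) is j == len(d)."""
    total = 0
    for j, dj in enumerate(d, start=1):
        total += dj
        if total == 2 * (j - 1):
            return j == len(d)
    return False


def gw2(nu, d):
    """GW_2(d) as in (6): the product of nu(d_i - 1), and 0 if d is not in D."""
    if not in_D(d):
        return 0.0
    return prod(nu[di - 1] if di - 1 < len(nu) else 0.0 for di in d)


def gw2_mass_of_size(nu, k):
    """Total GW_2-mass of the structures with exactly k vertices (brute force)."""
    max_degree = len(nu)  # larger degrees d have nu(d - 1) = 0
    return sum(gw2(nu, d) for d in product(range(1, max_degree + 1), repeat=k))


if __name__ == "__main__":
    nu = [0.5, 0.25, 0.25]  # hypothetical reproduction law with mean 3/4
    for k in (2, 3, 4):
        print(k, gw2_mass_of_size(nu, k))
```

The enumeration grows exponentially with $k$, so this check is only practical for very small trees.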

Remark. In the case when $\nu$ is the Poisson distribution with parameter $p < 1$, it is easily
checked that the law $\mathrm{GW}_2$ also describes the law of the genealogical tree of a Galton-Watson
process with reproduction law $\nu$, started from a single ancestor, and conditioned to have size
at least 2.



3 A limit theorem for typical rooted clusters 

For each fixed integer $n$, we consider a set $V_n$ of $n$ vertices and a function $d_n : V_n \to \mathbb{N}^*$ that
specifies the number of stubs appended to each vertex. We introduce the empirical distribution
of the number of stubs

\[ \mu_n(i) := \frac{1}{n} \#\{ v \in V_n : d_n(v) = i \}, \qquad i \in \mathbb{N}^* . \]

We write

\[ S_n := \sum_{v \in V_n} d_n(v) = n \sum_{i=1}^{\infty} i\,\mu_n(i) \]

for the total number of stubs, assuming for simplicity that this quantity is even. Our basic
assumption is that the limit

\[ \lim_{n \to \infty} \mu_n(i) := \mu(i) \qquad (7) \]

exists for every $i \ge 1$, and that the average number of stubs

\[ \sum_{i=1}^{\infty} i\,\mu_n(i) \]

converges as $n \to \infty$ to the first moment of $\mu$, i.e.

\[ \lim_{n \to \infty} \sum_{i=1}^{\infty} i\,\mu_n(i) = \sum_{i=1}^{\infty} i\,\mu(i) := m < \infty . \qquad (8) \]

We also denote by $\mu^*$ the probability measure on $\mathbb{N}^*$ which is obtained from $\mu$ by size-biased
sampling, that is

\[ \mu^*(i) := \frac{i\,\mu(i)}{m}, \qquad i \ge 1 . \]

A standard application of Scheffé's lemma shows that (7) can then be reinforced to

\[ \lim_{n \to \infty} \frac{n\, i\, \mu_n(i)}{S_n} = \mu^*(i) \quad \text{in } L^1(\mathbb{N}^*) . \qquad (9) \]

Finally, we introduce the probability measure $\nu$ on $\mathbb{N}$ induced from $\mu^*$ by the shift $i \mapsto i - 1$
from $\mathbb{N}^*$ to $\mathbb{N}$, viz.

\[ \nu(i) := \mu^*(i + 1), \qquad i \ge 0 . \]
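The passage from the degree distribution $\mu$ to the reproduction law $\nu$ is a two-step recipe (size-biasing, then a shift by one unit) that is easy to spell out in code. The following Python sketch uses exact fractions; the function names and the example distribution are assumptions made for this illustration only.

```python
from fractions import Fraction


def size_biased_shifted(mu):
    """Given mu as a dict {degree i >= 1: mass}, return (m, mu_star, nu).

    m is the first moment of mu, mu_star(i) = i * mu(i) / m is the size-biased
    law, and nu(i) = mu_star(i + 1) is its shift from N* to N.
    """
    m = sum(i * p for i, p in mu.items())
    mu_star = {i: i * p / m for i, p in mu.items()}
    nu = {i - 1: q for i, q in mu_star.items()}
    return m, mu_star, nu


def mean(law):
    """First moment of a law given as a dict {value: mass}."""
    return sum(i * p for i, p in law.items())


def second_moment_condition(mu):
    """Sum of i (i - 2) mu(i); it is <= 0 exactly when the mean of nu is <= 1."""
    return sum(i * (i - 2) * p for i, p in mu.items())


if __name__ == "__main__":
    mu = {1: Fraction(4, 5), 3: Fraction(1, 5)}  # hypothetical degree distribution
    m, mu_star, nu = size_biased_shifted(mu)
    print(m, mu_star, nu, mean(nu), second_moment_condition(mu))
```

The last function reflects the elementary identity $\sum_i i\,\nu(i) - 1 = \frac{1}{m}\sum_i i(i-2)\,\mu(i)$, which follows directly from the definitions of $\mu^*$ and $\nu$.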
We write $\mathcal{S}_n$ for the set of stubs appended to vertices in $V_n$. We pick a pairing $\pi \in \Pi(\mathcal{S}_n)$
uniformly at random, and denote by $\Gamma_n := \gamma(\pi)$ the resulting random configuration on $(V_n, d_n)$.
For every stub $s \in \mathcal{S}_n$, if the cluster of $\Gamma_n$ which contains $s$ is a tree, then $T_s$ denotes the
combinatorial structure which results from rooting that tree at the stub $s$ (see Section 2.2),
and otherwise, we decide that $T_s = \emptyset$.

We are interested in the random variable

\[ p_n(\mathbf{d}) := \frac{1}{S_n} \#\{ s \in \mathcal{S}_n : T_s = \mathbf{d} \}, \qquad \mathbf{d} \in \mathbb{D}, \]

which gives the proportion of stubs $s$ such that the cluster rooted at $s$ induced by $\Gamma_n$ is a tree
with structure $\mathbf{d}$. Similarly, we write

\[ p_n(\emptyset) := \frac{1}{S_n} \#\{ s \in \mathcal{S}_n : T_s = \emptyset \} \]

for the proportion of stubs $s$ such that the cluster containing $s$ induced by $\Gamma_n$ is not a tree.
The collection $(p_n(\mathbf{d}) : \mathbf{d} \in \mathbb{D})$ should thus be viewed as a variant of the empirical measure of
tree-clusters. We are now able to state our main asymptotic result on large random configurations.

Theorem 1 Assume that (7) and (8) hold. Then for every planar rooted tree configuration
$\mathbf{d} \in \mathbb{D}$, the following limit holds in $L^2(\mathbb{P})$:

\[ \lim_{n \to \infty} p_n(\mathbf{d}) = \mathrm{GW}_2(\mathbf{d}) . \]

If we further suppose that

\[ \sum_{i=1}^{\infty} i(i-2)\,\mu(i) \le 0, \qquad (10) \]

and also exclude the degenerate case when $\mu$ is the Dirac point mass at 2, then

\[ \lim_{n \to \infty} p_n(\emptyset) = 0 \quad \text{in } L^1(\mathbb{P}) . \]

The condition (10) plays an important part for random configuration models. According to
a well-known result due to Molloy and Reed [14], when (10) fails (assuming also some further
technical conditions), then there is some constant $c > 0$ such that, almost surely, the
random configuration contains a cluster of size at least $cn$ when $n$ is sufficiently
large. The size of this giant component is estimated in [15]. In the opposite direction, when (10) holds
with a strict inequality (again assuming some further technical conditions), Molloy and Reed
[T^ have shown that with probability one, the random configuration F„ contains at most n^^^ 
cycles and no cluster of size at least n^^^ whenever n is sufficiently large. Note that in the 
critical case when (10) is an equality, Theorem 1 implies that the probability that there is a
cluster of size at least $\varepsilon n$ tends to 0 for any $\varepsilon > 0$, because $\mathrm{GW}_2$ is a probability measure on
$\mathbb{D}$.
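Theorem 1 can be illustrated by a small Monte Carlo experiment. The Python sketch below is only a rough illustration under stated assumptions: the degree distribution, the sample size and the seed are hypothetical choices, and instead of the full structure $T_s$ it records the coarser quantity "proportion of stubs lying in a tree cluster with $k$ vertices", which by Theorem 1 and (6) should be close to the total $\mathrm{GW}_2$-mass of structures of size $k$.

```python
import random
from collections import Counter


def tree_cluster_stub_proportions(degrees, rng):
    """degrees[v] is the number of stubs of vertex v (all >= 1, even total).

    Returns a dict {k: proportion of stubs lying in a tree cluster of size k}
    for one uniform random pairing of the stubs.
    """
    # Each stub is represented by the vertex it is appended to.
    stubs = [v for v, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(stubs)
    edges = [(stubs[i], stubs[i + 1]) for i in range(0, len(stubs), 2)]

    parent = list(range(len(degrees)))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, w in edges:
        parent[find(u)] = find(w)

    size = Counter(find(v) for v in range(len(degrees)))
    n_edges = Counter(find(u) for u, _ in edges)
    stub_count = Counter()
    for root, k in size.items():
        if n_edges[root] == k - 1:  # connected with k - 1 edges: a tree
            stub_count[k] += 2 * (k - 1)
    return {k: c / len(stubs) for k, c in stub_count.items()}


if __name__ == "__main__":
    # Hypothetical example: mu(1) = 4/5 and mu(3) = 1/5, so nu(0) = 4/7, nu(2) = 3/7.
    degrees = [1] * 16000 + [3] * 4000
    props = tree_cluster_stub_proportions(degrees, random.Random(0))
    print(props.get(2, 0.0), 16 / 49)      # size 2: GW_2((1, 1)) = nu(0)^2
    print(props.get(4, 0.0), 384 / 2401)   # size 4: 2 * nu(2) * nu(0)^3
```

In this example the only tree structures of size 4 are $(3,1,1,1)$ and $(1,3,1,1)$, each of $\mathrm{GW}_2$-mass $\nu(2)\nu(0)^3$, which explains the second comparison value.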

The proof of Theorem 1 relies on asymptotics for the first and second moments of $p_n(\mathbf{d})$.
We first state:

Lemma 3 We have

\[ \lim_{n \to \infty} \mathbb{E}(p_n(\mathbf{d})) = \mathrm{GW}_2(\mathbf{d}) \]

for every $\mathbf{d} \in \mathbb{D}$.



Proof: Let the structure $\mathbf{d} = (d_1, \ldots, d_k)$ have size $k \ge 2$. We write $V'$ for a generic subset of
$V_n$ with $k$ vertices and $\mathcal{S}'$ for the set of stubs in $\mathcal{S}_n$ which are appended to vertices in $V'$. There
are two cases.

If the unordered families of degrees $\{d(v') : v' \in V'\}$ and $\{d_i : 1 \le i \le k\}$ do not coincide
(recall that in such families, numbers are repeated according to their multiplicity), then there
is no pairing of stubs for which the vertices of $V'$ are those of a tree-cluster with structure $\mathbf{d}$
when properly rooted. We say that $V'$ is bad.

Otherwise, we say that $V'$ is good. Introduce the set $G'$ of couples $(s, \pi) \in \mathcal{S}' \times \Pi(\mathcal{S}_n)$ such
that the cluster rooted at $s$ induced by the configuration $\gamma(\pi)$ is a tree whose set of vertices
coincides with $V'$ and has structure $\mathbf{d}$. The cardinal of $G'$ can then be computed by combining
Lemmas 1 and 2. Since $\#\mathcal{S}' = 2(k - 1)$, one gets

where M(d) denotes the multinomial coefficient 

\[ M(\mathbf{d}) := \frac{k!}{i_1! \cdots i_j!} \]

with $j$ the number of different values in the sequence $\mathbf{d}$ and $i_\ell$ the number of occurrences in $\mathbf{d}$
of the $\ell$-th value for $1 \le \ell \le j$.

So it remains to estimate the number of good subsets $V'$ with $k$ vertices, and for this we
use a probabilistic argument. We sample uniformly at random $k$ vertices in $V_n$, say $v_1, \ldots, v_k$,
successively and without replacement. It should be plain from the hypothesis (7) that when
$n \to \infty$, the $k$-tuple of degrees $(d_n(v_1), \ldots, d_n(v_k))$ converges in distribution to the $k$-tuple
formed by i.i.d. variables with law $\mu$; in particular the probability that $(d_n(v_1), \ldots, d_n(v_k)) = \mathbf{d}$
tends to $\prod_{i=1}^{k} \mu(d_i)$ as $n \to \infty$. We readily deduce that the probability that the (unordered)
family $\{d(v_1), \ldots, d(v_k)\}$ is good converges as $n \to \infty$ to

\[ M(\mathbf{d}) \prod_{i=1}^{k} \mu(d_i) . \]

As there are $n!/(n-k)! \sim n^k$ $k$-tuples of distinct vertices in $V_n$ and as the map that transforms
a $k$-tuple into an unordered set is $k!$ to 1, we conclude that the number of good subsets in $V_n$
is equivalent for large $n$ to

\[ \frac{n^k}{k!}\, M(\mathbf{d}) \prod_{i=1}^{k} \mu(d_i) . \qquad (12) \]

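Recall from Lemma 1(i) that

\[ \#(\mathcal{S}_n \times \Pi(\mathcal{S}_n)) = S_n\, \frac{S_n!}{(S_n/2)!}\, 2^{-S_n/2} \]

and that $d_i\,\mu(d_i) = m\,\nu(d_i - 1)$, by definition. Putting the pieces together, we find

\[ \mathbb{E}(p_n(\mathbf{d})) \sim n^k\, S_n^{-2k+1}\, (S_n/2)^{k-1}\, 2^{k-1} \prod_{i=1}^{k} \bigl( m\,\nu(d_i - 1) \bigr) = \left( \frac{nm}{S_n} \right)^{k} \prod_{i=1}^{k} \nu(d_i - 1) . \]

By (6) and (8), this completes the proof. □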

Lemma 3 essentially means that if we pick a stub $s$ uniformly at random in $\mathcal{S}_n$ and independently
of the random configuration $\Gamma_n$, then the conditional distribution of the combinatorial
structure of the random cluster rooted at $s$ given the event that this cluster is a tree, converges
weakly as $n \to \infty$ to the Galton-Watson law $\mathrm{GW}_2$. Theorem 1 is a much stronger statement
that involves the empirical distribution of structures of clusters, and requires second moment
estimates.

Lemma 4 We have

\[ \lim_{n \to \infty} \mathbb{E}\bigl( (p_n(\mathbf{d}))^2 \bigr) = (\mathrm{GW}_2(\mathbf{d}))^2 \]

for every $\mathbf{d} \in \mathbb{D}$.

Proof: The argument is similar to that of Lemma 3; in particular we shall use the same
notation and terminology. We start from the expression

\[ \mathbb{E}\bigl( (p_n(\mathbf{d}))^2 \bigr) = S_n^{-2}\, \mathbb{E}\bigl( \#\{ (s', s'') \in \mathcal{S}_n \times \mathcal{S}_n : T_{s'} = T_{s''} = \mathbf{d} \} \bigr) . \]

Let $V'$ and $V''$ be two generic subsets of $V_n$, both with $k$ vertices, and write $\mathcal{S}'$ (respectively,
$\mathcal{S}''$) for the set of stubs in $\mathcal{S}_n$ which are appended to vertices in $V'$ (respectively, $V''$). Note that
for all stubs $s' \in \mathcal{S}'$ and $s'' \in \mathcal{S}''$, the identity $T_{s'} = T_{s''} = \mathbf{d}$ can occur only if $V'$ and $V''$ are
both good and either coincide or are disjoint.

We first consider the situation when $V' = V''$. Recall that for any $s' \in \mathcal{S}'$, if $T_{s'} = \mathbf{d}$, then
$\mathcal{S}' = \mathcal{S}''$ has exactly $2(k - 1)$ stubs. It follows from the proof of Lemma 3 that the number
of triplets $(s', s'', \pi) \in \mathcal{S}' \times \mathcal{S}'' \times \Pi(\mathcal{S}_n)$ such that the cluster rooted at $s'$ induced by the
configuration $\gamma(\pi)$ is a tree whose set of vertices coincides with $V'$ and $T_{s'} = T_{s''} = \mathbf{d}$, is
bounded from above by



i=l 



see (11). Multiplying this by the number of good subsets $V'$ in $V_n$, that is approximately by
(12), we get a quantity which is small compared to

\[ \#(\mathcal{S}_n \times \mathcal{S}_n \times \Pi(\mathcal{S}_n)) = S_n^2\, \frac{S_n!}{(S_n/2)!}\, 2^{-S_n/2} \]

when $n \to \infty$. We conclude that in the evaluation of $\mathbb{E}((p_n(\mathbf{d}))^2)$, the contribution of pairs of
stubs $(s', s'')$ that belong to the same cluster becomes asymptotically negligible.

Next we consider the situation when $V'$ and $V''$ are good and disjoint. By calculations similar
to those that yield (12) in the proof of Lemma 3, we get that the number of good disjoint pairs
of subsets $(V', V'')$ in $V_n$ is equivalent for large $n$ to

\[ \frac{n^{2k}}{(k!)^2}\, M(\mathbf{d})^2 \left( \prod_{i=1}^{k} \mu(d_i) \right)^{2} . \]

It then follows from Lemmas 1 and 2 that the number of triplets $(s', s'', \pi) \in \mathcal{S}' \times \mathcal{S}'' \times \Pi(\mathcal{S}_n)$
such that $T_{s'} = T_{s''} = \mathbf{d}$ and the stubs $s'$ and $s''$ belong to disjoint clusters is close to



2k ('S'n. 4(A: ^)V- rt-Sn/2+2k~2 

(5„/2-2fc + 2)! 



vi=l 






Putting the pieces together yields the estimate

\[ \mathbb{E}\bigl( (p_n(\mathbf{d}))^2 \bigr) \sim n^{2k}\, S_n^{-4k+2}\, (S_n/2)^{2k-2}\, 2^{2k-2} \left( \prod_{i=1}^{k} m\,\nu(d_i - 1) \right)^{2} \sim \left( \prod_{i=1}^{k} \nu(d_i - 1) \right)^{2} . \]

By (6), this shows our claim. □

We are now able to establish Theorem 1.

Proof of Theorem 1: Combining Lemmas 3 and 4, we see that the variance of $p_n(\mathbf{d})$ tends
to 0 as $n \to \infty$, which establishes the first claim. Assume now further that (10) holds and that
$\mu \ne \delta_2$. Equivalently, this means that the reproduction law $\nu$ of the Galton-Watson process is
critical or sub-critical, and is not the Dirac mass at 1. So extinction occurs a.s. and

\[ \sum_{\mathbf{d} \in \mathbb{D}} \mathrm{GW}_2(\mathbf{d}) = 1 . \]

As

\[ p_n(\emptyset) = 1 - \sum_{\mathbf{d} \in \mathbb{D}} p_n(\mathbf{d}), \]

Fatou's lemma entails our second assertion. □



4 Some applications 

In this section, we shall develop some consequences of our main result. Recall that Theorem
1 implies that if one selects a stub uniformly at random and independently of a large random
configuration that fulfills the conditions there, then the structure of the cluster rooted at that
stub has asymptotically the distribution $\mathrm{GW}_2$. This hints at an interesting property of invariance
of such Galton-Watson trees under uniform random re-rooting. Recall the construction
of the structure of a planar tree rooted at some stub as it has been presented in Section 2.2;
Figure 2 below should explain better than words what is meant by re-rooting a rooted planar
tree at some stub.

Corollary 1 Suppose that $\nu$ is a critical or subcritical probability measure on $\mathbb{N}$ with $\nu \neq \delta_1$.
Let $D$ be a random rooted planar tree structure with distribution $\mathrm{GW}_2$. Conditionally on $D$,
select one of the $2(|D| - 1)$ stubs of $D$ uniformly at random, and denote by $D'$ the new structure
obtained from $D$ by re-rooting at that stub. Then $D'$ has again the law $\mathrm{GW}_2$.




Figure 2 : Two genealogical trees, both with two ancestors lying at the lowest level. 
The left-most ancestor serves as the origin, the root-stub pointing at the right-most ancestor. 
The tree on the right is the image of the tree on the left by re-rooting at the stub =>. 
Vertices are labeled by breadth first order before re-rooting. 

Proof: Re-rooting has no effect on the degree of a vertex, so we only need to verify the
statement for the conditional law of the Galton-Watson genealogical tree with two ancestors
given the unordered family of the degrees of vertices.

Fix some unordered family, say $A$, of $k$ positive integers (with possible repetitions), which
add up to $2(k - 1)$ and such that $\nu(\delta - 1) > 0$ for any integer $\delta$ in that family. Denote by $\mathcal{D}(A)$
the subset of rooted planar tree structures corresponding to some ordering of $A$. We see from
(6) that the conditional law $\mathrm{GW}_2(\,\cdot \mid \mathcal{D}(A))$ is simply the uniform distribution on $\mathcal{D}(A)$.

Next consider the random configuration on a set of $k$ vertices with degree family $A$ that is
induced by uniform random pairing, given that this configuration is a tree. Then root the
configuration using some stub that is picked independently and uniformly at random. On
the one hand, by construction, the law of the resulting combinatorial structure is obviously
invariant by uniform random re-rooting. On the other hand, we see from Lemma 2 that it also
coincides with the uniform distribution on $\mathcal{D}(A)$. This establishes our claim. □

We also refer to the recent work by Haas et al. [11] and references therein for a different
property of invariance under uniform re-rooting for certain classes of random continuous trees.

It may be interesting to point also at the following avatar of Corollary 1. A planar rooted
tree is said to be planted if the degree of the origin, i.e. of the vertex to which the root-stub is
appended, is 1. In other words, the combinatorial structure $d = (d_1, \ldots, d_k)$ fulfills $d_1 = 1$. So
a planted Galton-Watson tree describes the genealogy of a population where individuals beget
independently with the same reproduction law, except the ancestor who has exactly one child.
An easy consequence of Corollary 1 is that in the critical or sub-critical case, the structure of
a planted Galton-Watson tree is statistically invariant under re-rooting at a leaf (i.e. a vertex
with degree 1) chosen uniformly at random.

We next turn our attention to some quantitative consequences of Theorem 1, denoting for
every $k \geq 2$ by $C_n(k)$ the number of clusters of size $k$ in the random configuration $\Gamma_n$, i.e. the
number of distinct connected components with $k$ vertices in the partition of $V_n$ induced by $\Gamma_n$.

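Corollary 2 Assume that (7), (8) and (10) hold, and exclude the case when $\mu = \delta_2$. We have

\[ \lim_{n \to \infty} \sum_{k=2}^{\infty} k \, \mathbb{E}\Bigl( \Bigl| \frac{1}{n} C_n(k) - \frac{m}{k(k-1)} \nu^{*k}(k-2) \Bigr| \Bigr) = 0. \]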



Proof: Let us introduce first for every $k \geq 2$ the subset $\mathcal{D}_k$ of $\mathcal{D}$ consisting of structures of
rooted planar trees $d = (d_1, \ldots, d_k)$ of length $k$, and recall that according to Dwass [9],

\[ \mathrm{GW}_2(\mathcal{D}_k) = \frac{2}{k} \nu^{*k}(k-2), \]

where $\nu^{*k}$ stands for the $k$-th convolution power of $\nu$. As a tree of size $k$ has exactly $2(k - 1)$
stubs and $\mathcal{D}_k$ is a finite set, we deduce from Theorem 1 that if we denote by $T_n(k)$ the number
of clusters which are trees of size $k$, then

\[ \lim_{n \to \infty} \frac{2(k-1)}{S_n} T_n(k) = \frac{2}{k} \nu^{*k}(k-2), \]

where the convergence takes place in $L^2(\mathbb{P})$ and for every $k \geq 2$. Then we pick an arbitrary
sequence of integers that tends to $\infty$, from which we can extract by a diagonal extraction
procedure a subsequence such that with probability one,

\[ \lim_{n \rightsquigarrow \infty} \frac{2(k-1)}{S_n} T_n(k) = \frac{2}{k} \nu^{*k}(k-2) \quad \text{for all } k \geq 2, \]

where the notation $n \rightsquigarrow \infty$ means that $n$ tends to infinity along that subsequence.
Then observe that for each $n$, there is the obvious inequality

\[ \sum_{k \geq 2} 2(k-1) T_n(k) \leq S_n \]

while

\[ \sum_{k \geq 2} \frac{2}{k} \nu^{*k}(k-2) = 1 \]

since the reproduction law $\nu$ of the Galton-Watson process is critical or sub-critical and $\nu \neq \delta_1$.
A standard combination of Fatou and Scheffé lemmas entails that

\[ \lim_{n \rightsquigarrow \infty} \mathbb{E}\Bigl( \sum_{k=2}^{\infty} \Bigl| \frac{2(k-1)}{S_n} T_n(k) - \frac{2}{k} \nu^{*k}(k-2) \Bigr| \Bigr) = 0. \]

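Next, note that $T_n(k) \leq C_n(k)$ and $\sum_{k \geq 2} 2(k-1) C_n(k) \leq S_n$, as at least $2(k - 1)$ distinct
stubs are needed to connect $k$ vertices. It follows that

\[ \mathbb{E}\Bigl( \sum_{k=2}^{\infty} \frac{2(k-1)}{S_n} \bigl( C_n(k) - T_n(k) \bigr) \Bigr) \leq \mathbb{E}\Bigl( 1 - \sum_{k=2}^{\infty} \frac{2(k-1)}{S_n} T_n(k) \Bigr) \]

and we know from above that this quantity tends to $0$ as $n \rightsquigarrow \infty$.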
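This shows that

\[ \lim_{n \rightsquigarrow \infty} \sum_{k=2}^{\infty} \mathbb{E}\Bigl( \Bigl| \frac{2(k-1)}{S_n} C_n(k) - \frac{2}{k} \nu^{*k}(k-2) \Bigr| \Bigr) = 0, \]

and since by the assumption (8), $S_n/n \to m$, we have thus proved that

\[ \lim_{n \rightsquigarrow \infty} \sum_{k=2}^{\infty} k \, \mathbb{E}\Bigl( \Bigl| \frac{1}{n} C_n(k) - \frac{m}{k(k-1)} \nu^{*k}(k-2) \Bigr| \Bigr) = 0. \]

As the sequence of integers tending to infinity that we started from is arbitrary, this establishes
our claim. □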
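
The cluster counts $C_n(k)$ appearing in Corollary 2 can be explored numerically. The following Python sketch is an illustration only and is not part of the original argument: it pairs the stubs uniformly at random and counts connected components by size with a union-find structure. The function name `cluster_size_counts` and the degree sequences used with it are hypothetical choices made for this example.

```python
import random
from collections import Counter


def cluster_size_counts(degrees, rng):
    """Return a Counter {k: number of clusters with k vertices}.

    degrees[v] is the number of stubs attached to vertex v (illustrative input).
    """
    # One list entry per stub, labelled by the vertex it is attached to.
    stubs = [v for v, d in enumerate(degrees) for _ in range(d)]
    if len(stubs) % 2:
        raise ValueError("the total number of stubs must be even")
    # Pairing consecutive entries of a uniform shuffle gives a uniform pairing.
    rng.shuffle(stubs)

    parent = list(range(len(degrees)))

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Each pair of stubs creates an edge, which merges two components
    # unless it is a loop or closes a cycle.
    for a, b in zip(stubs[0::2], stubs[1::2]):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    component_sizes = Counter(find(v) for v in range(len(degrees)))
    return Counter(component_sizes.values())
```

Calling `cluster_size_counts(degrees, random.Random(seed))` for a long degree sequence and dividing the counts by the number of vertices gives empirical values of $C_n(k)/n$, which can be compared with the limiting expression in the statement above.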

Corollary 2 provides the explanation for the asymptotic behavior (2) that motivated this
work. Specifically, we know from Theorem 1 that when the requirements (7), (8) and (10) are
fulfilled, then, roughly speaking, multiple edges, loops or cycles are rare. This
means that almost all creations of edges correspond to aggregations of clusters, which enables
us to view the configuration model as a stochastic microscopic version of the terminal state of
concentrations with a deterministic evolution governed by the variant (1) of Smoluchowski's
coagulation equations. In [5], one assumes that initially all particles are monomers, i.e. consist
in isolated vertices to which some stubs are appended. In the notation of the present work
(beware that this differs from that in [5]!), the initial concentration of particles with $i \geq 1$
stubs is $m^{-1}\mu(i)$, which is a finite measure on $\mathbb{N}^*$ with unit first moment. In the framework
of the random configuration model with $n$ vertices, this corresponds to assuming that particles
live in a volume $mn$ and hence the initial concentration of monomers with $i$ stubs is given
by

\[ m^{-1}\mu_n(i) = \frac{1}{mn} \, \#\{v \in V_n : d_n(v) = i\}, \qquad i \in \mathbb{N}^*. \]

After the random pairing, the concentration of polymers with size $k$ (i.e. clusters with $k$
vertices) is then $(mn)^{-1} C_n(k)$ and Corollary 2 shows that

\[ \lim_{n \to \infty} \frac{1}{mn} C_n(k) := c_\infty(0, k) = \frac{1}{k(k-1)} \nu^{*k}(k-2). \]

One has thus recovered (2).
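
As a numerical illustration of the limiting concentrations, the following Python sketch evaluates $\nu^{*k}(k-2)/(k(k-1))$ for a finitely supported degree law $\mu$. It assumes the size-biased relation $\nu(j) = (j+1)\mu(j+1)/m$ between $\mu$ and the reproduction law, which is our reading of the definitions and should be checked against the earlier sections. The function names and the example law are hypothetical.

```python
def size_biased_law(mu):
    """mu: dict {degree i >= 1: probability}. Returns (nu, m).

    Assumption: nu[j] = (j + 1) * mu(j + 1) / m, with m the mean of mu.
    """
    m = sum(i * p for i, p in mu.items())
    nu = [(j + 1) * mu.get(j + 1, 0.0) / m for j in range(max(mu))]
    return nu, m


def convolve_truncated(a, b, size):
    """Convolution of two nonnegative sequences, keeping indices below size."""
    out = [0.0] * size
    for i, x in enumerate(a):
        if x == 0.0:
            continue
        for j, y in enumerate(b):
            if i + j >= size:
                break
            out[i + j] += x * y
    return out


def limit_concentrations(mu, k_max):
    """Return ({k: nu^{*k}(k-2) / (k (k-1))} for 2 <= k <= k_max, m)."""
    nu, m = size_biased_law(mu)
    size = k_max  # indices 0 .. k_max - 2 are the only ones needed
    power = [1.0] + [0.0] * (size - 1)  # nu^{*0} is the Dirac mass at 0
    conc = {}
    for k in range(1, k_max + 1):
        power = convolve_truncated(power, nu, size)  # now holds nu^{*k}
        if k >= 2:
            conc[k] = power[k - 2] / (k * (k - 1))
    return conc, m
```

For a subcritical example such as `limit_concentrations({1: 0.8, 2: 0.1, 3: 0.1}, 300)`, the sum of `2 * (k - 1) * conc[k]` over `k` should be close to 1, in line with the fact that $\mathrm{GW}_2$ has total mass 1 when extinction occurs almost surely.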

We also note that Corollary 2 solves a problem that has been addressed in Section II.C of
[17] by analytic and numerical techniques.

Acknowledgment. We would like to thank Maria Eulalia Vares for stimulating discussions
which have been at the origin of this work.